Abstract. In an $\epsilon$-Nash equilibrium, a player can gain at most
$\epsilon$ by changing his behaviour. Recent work has addressed the question of
how best to compute $\epsilon$-Nash equilibria, and for what values of
$\epsilon$ a polynomial-time algorithm exists. An $\epsilon$-well-supported Nash
equilibrium ($\epsilon$-WSNE) has the additional requirement that any strategy
that is used with non-zero probability by a player must have payoff at most
$\epsilon$ less than the best response. A recent algorithm of Kontogiannis and
Spirakis shows how to compute a $\frac{2}{3}$-WSNE in polynomial time, for
bimatrix games. Here we introduce a new technique that leads to an improvement
to the worst-case approximation guarantee.

1 Introduction 

The apparent hardness of computing an exact Nash equilibrium [3, 2] has led to
work on algorithms for computing the weaker solution concept of approximate
Nash equilibrium. In an $\epsilon$-Nash equilibrium, the criterion of "no
incentive to deviate" is replaced by a weaker "low incentive to deviate": a
player cannot improve his payoff by more than some quantity $\epsilon > 0$ by
changing his behaviour. Two notions of approximate Nash equilibrium have been
studied: approximate Nash equilibrium, and well-supported Nash equilibrium
(WSNE). In this paper we study the problem of finding a WSNE.

There has been relatively little work on computing a WSNE. The first result
gave a $\frac{5}{6}$ additive approximation [4], but this only holds if a
certain graph-theoretic conjecture is true. The best-known polynomial-time
additive approximation algorithm was given by Kontogiannis and Spirakis, and
achieves a $\frac{2}{3}$-approximation [7]. In [6], which is an earlier
conference version of [7], the authors presented an algorithm that they claimed
was polynomial-time and achieves a $\phi$-WSNE, where
$\phi = \frac{\sqrt{11}}{2} - 1 \approx 0.6583$, but this was later withdrawn,
and instead the polynomial-time $\frac{2}{3}$-approximation algorithm was
presented in [7]. It has also been shown that there is a PTAS for
well-supported approximate Nash equilibria if and only if there is a PTAS for
approximate Nash equilibria [2]. The existence of a PTAS for approximate Nash
equilibria is the main open problem in this line of work.

In this paper, we give a polynomial-time algorithm that computes an
$\epsilon$-WSNE with $\epsilon < \frac{2}{3}$. We do this by extending the
$\frac{2}{3}$-WSNE algorithm of Kontogiannis and Spirakis. In particular, we
show that either the strategies generated by their algorithm can be tweaked to
improve the approximation, or that we can find a sub-game that resembles
matching pennies, which again leads to a better approximation. This allows us
to construct a $(\frac{2}{3} - z)$-WSNE in polynomial time, where
$z = 0.004735$. This value of $z$ is only a lower bound on the improvement over
$\frac{2}{3}$ that our algorithm achieves; we expect that our algorithm
actually provides a better approximation guarantee in the worst case.

2 Definitions 

A square bimatrix game is a pair $(R, C)$ of two $n \times n$ matrices: the
matrix $R$ gives payoff values for the row player, and the matrix $C$ gives
payoff values for the column player. We will assume that all payoffs in $R$ and
$C$ are in the range $[0, 1]$. We will use $[n] = \{1, 2, \ldots, n\}$ to denote
the set of pure strategies in the game. To play the game, both players
simultaneously select a pure strategy: the row player selects a row
$i \in [n]$, and the column player selects a column $j \in [n]$. The row player
then receives a payoff of $R_{i,j}$, and the column player receives a payoff of
$C_{i,j}$.

A mixed strategy is a probability distribution over $[n]$. We will denote a
mixed strategy as a vector $x$ of length $n$, such that $x_i$ is the
probability that the pure strategy $i$ is played. The support of a mixed
strategy, denoted as Supp(x), is the set of pure strategies that are played
with non-zero probability by $x$. If $x$ is a mixed strategy for the row
player, and $y$ is a mixed strategy for the column player, then we call
$(x, y)$ a mixed strategy profile.

Let $y$ be a mixed strategy for the column player. The set of best responses
against $y$ for the row player is the set of pure strategies that maximize the
payoff against $y$. More formally, a pure strategy $i \in [n]$ is a best
response against $y$ if, for all pure strategies $i' \in [n]$ we have:

$\sum_{j \in [n]} y_j \cdot R_{i,j} \ge \sum_{j \in [n]} y_j \cdot R_{i',j}.$

Best responses for the column player are defined analogously. A mixed strategy 
profile (x, y) is a mixed Nash equilibrium if every pure strategy in Supp(x) is a 
best response against y, and every pure strategy in Supp(y) is a best response 
against x. Nash's theorem [8] asserts that every bimatrix game has a mixed Nash 
equilibrium. 

An approximate well-supported Nash equilibrium is defined by weakening the
requirements of a mixed Nash equilibrium. For a mixed strategy $y$ of the
column player, a pure strategy $i \in [n]$ is an $\epsilon$-best response for
the row player if, for all pure strategies $i' \in [n]$ we have:

$\sum_{j \in [n]} y_j \cdot R_{i,j} \ge \sum_{j \in [n]} y_j \cdot R_{i',j} - \epsilon.$

We define $\epsilon$-best responses for the column player analogously. A mixed
strategy profile $(x, y)$ is an $\epsilon$-well-supported Nash equilibrium
($\epsilon$-WSNE) if every pure strategy in Supp(x) is an $\epsilon$-best
response against $y$, and every pure strategy in Supp(y) is an $\epsilon$-best
response against $x$.

Note that, in comparison to a Nash equilibrium, where all strategies in the
support are required to be best responses, in an $\epsilon$-WSNE we are allowed
to use strategies that are not best responses, as long as their payoff is
within $\epsilon$ of an actual best response. We define the row player's regret
in a mixed strategy profile $(x, y)$ to be the difference between the payoff
obtained by playing a best response against $y$, and the payoff of the
lowest-payoff strategy in Supp(x). The column player's regret is defined
analogously. So $(x, y)$ is an $\epsilon$-WSNE if and only if both players have
regret of $\epsilon$ or lower.
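The definitions of regret and of an $\epsilon$-WSNE translate directly into a
few lines of code. The following is a minimal sketch, not part of the paper:
the function names (`row_payoffs`, `col_payoffs`, `regret`, `wsne_epsilon`) are
illustrative choices, games are stored as nested lists, and exact rational
arithmetic is used so that equalities can be checked without rounding.

```python
from fractions import Fraction as F

def row_payoffs(R, y):
    """Expected payoff of every pure row against the column mix y."""
    return [sum(R[i][j] * y[j] for j in range(len(y))) for i in range(len(R))]

def col_payoffs(C, x):
    """Expected payoff of every pure column against the row mix x."""
    return [sum(C[i][j] * x[i] for i in range(len(x))) for j in range(len(C[0]))]

def regret(payoffs, strategy):
    """Best-response payoff minus the worst payoff inside the support."""
    support = [k for k, p in enumerate(strategy) if p > 0]
    return max(payoffs) - min(payoffs[k] for k in support)

def wsne_epsilon(R, C, x, y):
    """Smallest epsilon for which (x, y) is an epsilon-WSNE of (R, C)."""
    return max(regret(row_payoffs(R, y), x), regret(col_payoffs(C, x), y))

# Matching pennies with payoffs in [0, 1]: the uniform profile is exact,
# while the pure profile (row 0, column 0) leaves the column player regret 1.
R = [[1, 0], [0, 1]]
C = [[0, 1], [1, 0]]
half = F(1, 2)
print(wsne_epsilon(R, C, [half, half], [half, half]))  # 0
print(wsne_epsilon(R, C, [1, 0], [1, 0]))              # 1
```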

3 Outline 

Our work is based on the algorithm of Kontogiannis and Spirakis [7], which
finds a $\frac{2}{3}$-WSNE. We begin by describing their algorithm. Let
$(R, C)$ be an $n \times n$ bimatrix game. The KS algorithm begins by checking
if there exists any $i$ and $j$ such that $R_{i,j} \ge \frac{1}{3}$ and
$C_{i,j} \ge \frac{1}{3}$. If such a pair exists, then we have a
$\frac{2}{3}$-WSNE in which the row player plays the pure strategy $i$, and the
column player plays the pure strategy $j$.

Otherwise, it proceeds by constructing a zero-sum game $(D, -D)$, where:

$D = \frac{1}{2}(R - C).$

They then proved that the min-max strategies for this game are in fact a
$\frac{2}{3}$-WSNE in the bimatrix game $(R, C)$. Moreover, since zero-sum
games can be solved in polynomial time, this gives a polynomial-time algorithm
for finding a $\frac{2}{3}$-WSNE.

Theorem 1 ([7]). The KS algorithm computes a $\frac{2}{3}$-WSNE in polynomial
time.

It is not difficult to find examples for which the bound given in Theorem 1
is tight. Figure 1a gives a bimatrix game $(R, C)$ where this is the case.
Strictly speaking, this bimatrix game should have been eliminated by the
pre-processing step, because there are two pairs $(i, j)$ with
$R_{i,j} \ge \frac{1}{3}$ and $C_{i,j} \ge \frac{1}{3}$. However, this issue can
be solved by replacing every instance of $\frac{1}{3}$ with
$\frac{1}{3} - \epsilon$, for some $\epsilon > 0$. This gives a
$(\frac{2}{3} - \frac{\epsilon}{2})$-WSNE instead of a $\frac{2}{3}$-WSNE. For
the sake of exposition, however, we will keep the $\frac{1}{3}$ payoffs as they
are.

Figure 1b shows the zero-sum game $(D, -D)$, where $D = \frac{1}{2}(R - C)$. It
can be seen that, if the row player plays the pure strategy $B$, then the
column player gets payoff 0 for both $\ell$ and $r$. On the other hand, if the
column player plays $\ell$ with probability 0.5, and $r$ with probability 0.5,
then the row player gets payoff 0 for both $T$ and $B$. These two strategies
are min-max strategies for $(D, -D)$.

[Fig. 1: (a) The bimatrix game. (b) The corresponding zero-sum game. A
$2 \times 2$ bimatrix game for which the algorithm of Kontogiannis and Spirakis
produces a $\frac{2}{3}$-WSNE.]

Let us now consider the outcome when these two strategies are played in
$(R, C)$. Since the column player uses a uniform distribution over $\ell$ and
$r$, the payoff for the row player of playing $T$ is $\frac{2}{3}$. However,
the row player's strategy uses $B$, and thus achieves a payoff of 0. Therefore,
the regret suffered by the row player is $\frac{2}{3}$, and this pair of
strategies is a $\frac{2}{3}$-WSNE.
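The tight case can be traced numerically. The sketch below is only an
illustration: the entries of Figure 1 are not reproduced in this text, so the
payoffs used here are a hypothetical game chosen to match the description (row
$B$ pays 0 everywhere, and row $T$ pays the row player $\frac{1}{3}$ in column
$\ell$ and 1 in column $r$, with the column player's payoffs mirrored). It
builds $D = \frac{1}{2}(R - C)$, checks the saddle-point condition for the two
strategies named above, and evaluates the row player's regret in $(R, C)$.

```python
from fractions import Fraction as F

# Hypothetical payoffs consistent with the description of Figure 1
# (rows T, B; columns l, r). These entries are an assumption.
R = [[F(1, 3), F(1)], [F(0), F(0)]]
C = [[F(1), F(1, 3)], [F(0), F(0)]]

# Zero-sum game D = (R - C) / 2 used by the KS algorithm.
D = [[(R[i][j] - C[i][j]) / 2 for j in range(2)] for i in range(2)]

x = [F(0), F(1)]         # row player plays B
y = [F(1, 2), F(1, 2)]   # column player mixes uniformly over l and r

def row_values(M, y):
    """Payoff of each pure row of M against the column mix y."""
    return [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]

def col_values(M, x):
    """Payoff x^T M for each pure column against the row mix x."""
    return [sum(M[i][j] * x[i] for i in range(len(x))) for j in range(len(M[0]))]

# (x, y) is a min-max pair of (D, -D) when no row beats the value against y
# and no column pushes the row player below the value against x.
value = sum(x[i] * row_values(D, y)[i] for i in range(2))
is_minmax = max(row_values(D, y)) <= value <= min(col_values(D, x))

# Row player's regret in the original game (R, C).
payoffs = row_values(R, y)
support = [i for i, p in enumerate(x) if p > 0]
row_regret = max(payoffs) - min(payoffs[i] for i in support)

print(is_minmax, row_regret)  # True 2/3
```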

Our approach is to take the strategies provided by the KS algorithm, and to
improve them. In the case given in Figure 1, we can change the column player's
strategy $y$ to improve the row player's regret. Our aim is to choose the
probability distribution over $\ell$ and $r$ that minimizes the regret for the
row player. This can be achieved by taking all of the probability assigned to
$r$, and moving it to $\ell$. Doing this will reduce the row player's payoff
for playing $T$ to $\frac{1}{3}$, and thus will allow us to produce
$(\ell, B)$, which is a $\frac{1}{3}$-WSNE, as opposed to the
$\frac{2}{3}$-WSNE that we began with. In our final algorithm, we will also
modify $x$ in order to improve the column player's regret, but this has no
effect in this example.

However, it is not always possible to improve the strategy given by the KS
algorithm. Figure 2 gives such an example. It can be seen that, when the row
player plays $B$, and the column player mixes uniformly between $\ell$ and $r$,
then we have a min-max strategy pair for the zero-sum game shown in Figure 2b.
This once again gives us a $\frac{2}{3}$-WSNE in the original bimatrix game. We
cannot improve this by rearranging the probability on $\ell$ and $r$: if we
attempt to put more probability on $\ell$, then the payoff of $T$ rises, and if
we attempt to put more probability on $r$, then the payoff of $M$ rises.

This problem can be solved by noting that the $2 \times 2$ sub-matrix induced
by $T$, $M$, $\ell$, and $r$ resembles a matching pennies game. Furthermore, if
the row player mixes uniformly over $T$ and $M$, and the column player mixes
uniformly over $\ell$ and $r$, then the payoff is $\frac{2}{3}$ for all of $T$,
$M$, $\ell$, and $r$. Therefore, we have a $\frac{1}{3}$-WSNE.

[Fig. 2: A bimatrix game where our improvement procedure does not work.]

The rest of this paper is dedicated to showing that these ideas can be applied
to find an $\epsilon$-WSNE with $\epsilon < \frac{2}{3}$.

4 Our algorithm 

In this section we describe our algorithm for finding an $\epsilon$-WSNE, with
$\epsilon < \frac{2}{3}$. This section is split into two parts: first we
describe a pair of linear programs that can be used to find the best WSNE over
a given pair of supports, then we use these LPs to define our full algorithm.

4.1 Finding the best WSNE on a given pair of supports 

Suppose that we are given a support $S_r$ for the row player, and a support
$S_c$ for the column player. Suppose that the row player is forced to play
strategies $x$ with $\mathrm{Supp}(x) = S_r$, and that the column player is
forced to play strategies $y$ with $\mathrm{Supp}(y) = S_c$. Let $\epsilon$ be
the best possible approximation guarantee that can be obtained by a WSNE when
the players are restricted in this way. In this section we give an algorithm
for finding an $\epsilon'$-WSNE such that $\epsilon' \le \epsilon$.

We define two linear programs, one for each player. The linear program for the
column player computes a mixed strategy $y'$ that is restricted so that
$\mathrm{Supp}(y') \subseteq S_c$. It minimizes the regret of the row player
under the assumption that the row player's support is $S_r$. We get $x'$ from
an analogous linear program for the row player. We show that the mixed strategy
profile $(x', y')$ is our desired $\epsilon'$-WSNE. Since $\mathrm{Supp}(y')$
could be a strict subset of $S_c$, or $\mathrm{Supp}(x')$ could be a strict
subset of $S_r$, we may have $\epsilon' < \epsilon$.

We begin by defining the linear program for the column player. It takes
supports $S_c$ and $S_r$ as parameters.

Definition 2. We define the following linear program, where $y'$ is a mixed
strategy for the column player:

Minimize: $\epsilon$

Subject to: $R_{i'} \cdot y' - R_i \cdot y' \le \epsilon$    $i \in S_r$, $i' \in [n]$    (1)

            $y'_j = 0$    $j \notin S_c$    (2)

The purpose of Constraint (2) is to restrict $y'$ to only play columns in the
support $S_c$. Suppose that $x$ is a mixed strategy for the row player with
$\mathrm{Supp}(x) = S_r$. Constraint (1) says that, for every row $i \in S_r$,
and every row $i' \in [n]$, the difference between $R_{i'} \cdot y'$ and
$R_i \cdot y'$ must be less than or equal to $\epsilon$. Therefore, for every
mixed strategy $x$ of the row player with $\mathrm{Supp}(x) = S_r$ we will have
that $x$ is an $\epsilon$-best response to $y'$.
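For a column support of size two, the linear program of Definition 2 can be
solved by hand, which makes its behaviour easy to inspect. The following is a
minimal sketch under that restriction, not the general method: with $y'$
putting probability $p$ on column `a` and $1 - p$ on column `b`, every
instance of Constraint (1) is a line in $p$, so the optimum of the resulting
min-max problem lies at $p = 0$, at $p = 1$, or where two lines cross. The
function name and argument names are illustrative; a general implementation
would hand the LP to a solver instead.

```python
from fractions import Fraction as F

def best_column_mix_on_pair(R, S_r, a, b):
    """Solve the LP of Definition 2 for the column support {a, b}.

    Returns (p, eps): the probability placed on column a and the optimal
    objective value, i.e. the row player's worst-case regret over S_r.
    """
    n = len(R)
    # Each constraint is f(p) = intercept + slope * p, stored as (slope, intercept).
    lines = []
    for i in S_r:
        for i2 in range(n):
            at_a = R[i2][a] - R[i][a]   # value of the constraint at p = 1
            at_b = R[i2][b] - R[i][b]   # value of the constraint at p = 0
            lines.append((at_a - at_b, at_b))
    # The maximum of finitely many lines is convex and piecewise linear,
    # so it is minimised at an endpoint or at a crossing point.
    candidates = {F(0), F(1)}
    for s1, c1 in lines:
        for s2, c2 in lines:
            if s1 != s2:
                p = F(c2 - c1) / F(s1 - s2)
                if 0 <= p <= 1:
                    candidates.add(p)
    eps = lambda p: max(c + s * p for s, c in lines)
    best_p = min(candidates, key=eps)
    return best_p, eps(best_p)

# Hypothetical game matching the description of Figure 1 (rows T, B; columns
# l, r); the entries are an assumption. The row support is {B}.
R = [[F(1, 3), F(1)], [F(0), F(0)]]
p, eps = best_column_mix_on_pair(R, S_r=[1], a=0, b=1)
print(p, eps)  # all mass moves to column l, and the regret drops to 1/3
```

On this instance the optimum uses only one of the two columns, which
illustrates why the support of the LP solution may be a strict subset of the
support it was given.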

We also give an analogous linear program for the row player. Again, this
linear program takes the supports $S_c$ and $S_r$ as parameters.

Definition 3. We define the following linear program, where $x'$ is a mixed
strategy for the row player:

Minimize: $\epsilon$

Subject to: $C_{j'} \cdot x' - C_j \cdot x' \le \epsilon$    $j \in S_c$, $j' \in [n]$    (3)

            $x'_i = 0$    $i \notin S_r$    (4)

The solutions to these LPs allow us to find a well-supported Nash equilibrium.
Let $(x^*, \epsilon_x)$ be a solution of the LP given in Definition 3 with
parameters $S_r$ and $S_c$. Similarly, let $(y^*, \epsilon_y)$ be a solution of
the LP given in Definition 2 with parameters $S_r$ and $S_c$. We define
$\epsilon^*$ to be $\max(\epsilon_x, \epsilon_y)$. We can show that
$(x^*, y^*)$ is an $\epsilon^*$-WSNE.

Proposition 4. $(x^*, y^*)$ is an $\epsilon^*$-WSNE.

Proof. Since $y^*$ is a solution of the LP given in Definition 2, and since
Constraint (4) gives $\mathrm{Supp}(x^*) \subseteq S_r$, Constraint (1) implies
that

$R_{i'} \cdot y^* - R_i \cdot y^* \le \epsilon_y,$

for every row $i \in \mathrm{Supp}(x^*)$, and every row $i' \in [n]$.
Therefore, $x^*$ is an $\epsilon_y$-best response against $y^*$.

Similarly, since $x^*$ is a solution of the LP given in Definition 3, and since
Constraint (2) gives $\mathrm{Supp}(y^*) \subseteq S_c$, Constraint (3) implies
that

$C_{j'} \cdot x^* - C_j \cdot x^* \le \epsilon_x,$

for every column $j \in \mathrm{Supp}(y^*)$, and every column $j' \in [n]$.
Therefore, $y^*$ is an $\epsilon_x$-best response against $x^*$. Thus, we have
that $(x^*, y^*)$ is an $\epsilon^*$-WSNE. □

The most important property, and the main result of this subsection, is that
$(x^*, y^*)$ is at least as good as every well-supported Nash equilibrium with
supports $S_c$ and $S_r$. We have the following proposition.

Proposition 5. For every $\epsilon$-WSNE $(x, y)$ with
$\mathrm{Supp}(x) = S_r$ and $\mathrm{Supp}(y) = S_c$, we have
$\epsilon^* \le \epsilon$.

Proof. Since $\mathrm{Supp}(y) = S_c$, we know that $y$ satisfies the
constraints given by (2). Moreover, since $x$ is an $\epsilon$-best response to
$y$, we must have, for every row $i \in \mathrm{Supp}(x)$, and every row
$i' \in [n]$:

$R_{i'} \cdot y - R_i \cdot y \le \epsilon.$

This implies that $(y, \epsilon)$ is feasible in the LP given by Definition 2,
which implies that $\epsilon \ge \epsilon_y$.

Similarly, since $\mathrm{Supp}(x) = S_r$, we know that $x$ satisfies the
constraints given by (4). Moreover, since $y$ is an $\epsilon$-best response to
$x$, we must have, for every column $j \in \mathrm{Supp}(y)$, and every column
$j' \in [n]$:

$C_{j'} \cdot x - C_j \cdot x \le \epsilon.$

This implies that $(x, \epsilon)$ is feasible in the LP given by Definition 3,
which implies that $\epsilon \ge \epsilon_x$.

Since $\epsilon \ge \epsilon_y$ and $\epsilon \ge \epsilon_x$, we must have
$\epsilon \ge \max(\epsilon_y, \epsilon_x) = \epsilon^*$. □

4.2 Finding a well-supported Nash equilibrium 

Our algorithm consists of three distinct procedures. 

(1) Find the best pure WSNE. In this procedure, we find the best WSNE
when the players are restricted to using pure strategies. The KS algorithm
does a preprocessing step in which all games with a pure $\frac{2}{3}$-WSNE are
eliminated. This procedure is a generalisation of that step. Suppose that the
row player plays row $i$, and that the column player plays column $j$. Let

$\epsilon_r = \max_{i'}(R_{i',j}) - R_{i,j}$,    $\epsilon_c = \max_{j'}(C_{i,j'}) - C_{i,j}$.

Clearly, we have that $i$ is an $\epsilon_r$-best response against $j$, and
that $j$ is an $\epsilon_c$-best response against $i$. Therefore, $(i, j)$ is a
$\max(\epsilon_r, \epsilon_c)$-WSNE, and this is the best possible WSNE using
pure strategies $i$ and $j$. Therefore, we can find the best pure WSNE by
enumerating over all $O(n^2)$ possible pairs of pure strategies. Let
$\epsilon_p$ be the best approximation guarantee that is found during this
procedure.

(2) Find the best WSNE with $2 \times 2$ support. In this procedure, we find
the best possible WSNE when we assume that both players use a support of
size 2. Recall from Figure 2 that, if we cannot improve the strategies from
the KS algorithm, then we want to find a matching pennies sub-game. This
procedure is a generalisation of that idea, because every matching pennies
sub-game is a WSNE with $2 \times 2$ support.

We can use the linear programs from Definitions 2 and 3 to implement this
procedure. For each of the $O(n^4)$ possible $2 \times 2$ supports, we solve
the LPs to find a WSNE. Proposition 5 implies that this WSNE is at least as
good as the best possible WSNE using those supports. In particular, this WSNE
is at least as good as any matching pennies sub-game on these supports. Let
$\epsilon_m$ be the best approximation guarantee that is found during this
procedure.

(3) Find an improvement over the KS algorithm. Recall from Figure 1
that we want to improve the WSNE returned from the KS algorithm by
rearranging the probabilities assigned by the two strategies. Suppose that
the KS algorithm produces the mixed strategy pair $(x, y)$. We will find the
best possible WSNE over the supports Supp(x) and Supp(y). Again, this can
be implemented using the linear programs from Definitions 2 and 3 for the
supports Supp(x) and Supp(y). Let $(x^*, y^*)$ be the mixed strategy profile
returned by the LPs, and let $\epsilon_i$ be the smallest value such that
$(x^*, y^*)$ is an $\epsilon_i$-WSNE.

After executing each of these procedures, we select the smallest among
$\epsilon_p$, $\epsilon_m$, and $\epsilon_i$, and return the corresponding
well-supported Nash equilibrium.
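Procedure (1) is the simplest of the three to make concrete. The sketch below
is one possible rendering, with an illustrative function name: it precomputes
the best payoff available in each column of $R$ and each row of $C$, so that
the regret of every pure profile is read off in constant time and the whole
enumeration takes $O(n^2)$ steps.

```python
def best_pure_wsne(R, C):
    """Return (eps_p, i, j) for the best pure profile of the game (R, C)."""
    n_rows, n_cols = len(R), len(R[0])
    # Best payoff available to each player against every pure opponent strategy.
    col_max_R = [max(R[k][j] for k in range(n_rows)) for j in range(n_cols)]
    row_max_C = [max(C[i]) for i in range(n_rows)]
    best = None
    for i in range(n_rows):
        for j in range(n_cols):
            eps_r = col_max_R[j] - R[i][j]   # row player's regret at (i, j)
            eps_c = row_max_C[i] - C[i][j]   # column player's regret at (i, j)
            eps = max(eps_r, eps_c)
            if best is None or eps < best[0]:
                best = (eps, i, j)
    return best

# Matching pennies has no good pure profile; a coordination game has an exact one.
pennies = best_pure_wsne([[1, 0], [0, 1]], [[0, 1], [1, 0]])
coordination = best_pure_wsne([[1, 0], [0, 1]], [[1, 0], [0, 1]])
print(pennies, coordination)  # (1, 0, 0) (0, 0, 0)
```

Procedures (2) and (3) follow the same pattern, except that each candidate pair
of supports is evaluated by solving the two linear programs of Section 4.1
rather than by a direct formula.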

5 Roadmap for our proof 

Our goal is to show that our algorithm finds a $(\frac{2}{3} - z)$-WSNE, for
some constant $z > 0$. The precise value of $z$ will be determined during our
proof, so at the start of the proof we treat $z$ as a parameter.

Recall that our algorithm finds three distinct WSNEs: we have $\epsilon_p$,
which corresponds to the best pure WSNE found by Procedure (1), we have
$\epsilon_m$, which corresponds to the best $2 \times 2$ WSNE found by
Procedure (2), and we have $\epsilon_i$, which corresponds to the improvement
of the KS algorithm's WSNE in Procedure (3). In our proof, we will show that if
$\epsilon_p > \frac{2}{3} - z$, and if $\epsilon_m > \frac{2}{3} - z$, then we
must have $\epsilon_i \le \frac{2}{3} - z$. Therefore, our algorithm will
always find a $(\frac{2}{3} - z)$-WSNE.

Suppose that the KS algorithm outputs the mixed strategy profile $(x, y)$. The
goal of our proof is to produce a mixed strategy profile $(x', y')$, with
$\mathrm{Supp}(x') = \mathrm{Supp}(x)$ and
$\mathrm{Supp}(y') = \mathrm{Supp}(y)$, such that $(x', y')$ is a
$(\frac{2}{3} - z)$-WSNE. Since Proposition 5 implies that Procedure (3) will
find a WSNE that is at least as good as $(x', y')$, this will complete our
proof.

The first step of our proof is to generalise the analysis performed by
Kontogiannis and Spirakis. They showed, under the assumption that there is no
pure $\frac{2}{3}$-WSNE, that their algorithm produces a $\frac{2}{3}$-WSNE.
However, Procedure (1) of our algorithm only eliminates the case where there is
a pure $(\frac{2}{3} - z)$-WSNE. As this is a weaker assumption, the analysis
of Kontogiannis and Spirakis no longer applies. Therefore, in Section 6, we
perform the analysis with our new assumption, and we show that, if there is no
pure $(\frac{2}{3} - z)$-WSNE, then the KS algorithm produces a
$(\frac{2}{3} + 2z)$-WSNE.

In our proof, we will focus on how the mixed strategy $y'$ can be constructed
from $y$. However, all of our arguments can be applied symmetrically in order
to construct $x'$ from $x$. In Section 7, we take the strategy $y$ that was
returned by the KS algorithm, and use it to define a strategy $y^{imp}$. Then,
we define $y'$ to be a convex combination of $y$ and $y^{imp}$. Formally, we
define $y' = y(t)$, where $t \in [0, 1]$, and:

$y(t) := (1 - t) \cdot y + t \cdot y^{imp}.$

For the rest of our proof, we are concerned with finding a value of z for which 
the following property holds. 

Definition 6. We say that property $P(z)$, which is parametrized by $z$, is
true if there exists a value of $t$ such that, for all row player strategies
$x'$ with $\mathrm{Supp}(x') = \mathrm{Supp}(x)$, $x'$ is a
$(\frac{2}{3} - z)$-best response against $y(t)$.

If $P(z)$ holds then our algorithm produces a $(\frac{2}{3} - z)$-WSNE for all
games. Thus, we would like to find the largest value of $z$ for which we can
prove that $P(z)$ holds.

In this paper, we develop a test that proves that $P(z)$ holds for a restricted
range of $z$. In more detail, if the test is passed then $P(z)$ holds. However,
we do not prove that, if the test is failed, then $P(z)$ does not hold. In
Sections 8 through 13, we develop this test.

In Sections 8, 9, and 10, we develop a simple linear program that forms
the basis of our test. This linear program captures all possible input games
that do not have a pure $(\frac{2}{3} - z)$-WSNE, because such solutions are
found by Procedure (1). In Section 11, we observe that, if the game does not
contain a matching pennies sub-game, then the linear program can be
strengthened. Therefore, we use the fact that Procedure (2) eliminates all
matching pennies sub-games to obtain a stronger linear program. In Sections 12
and 13, we show how the solutions of our linear program can be used for the
test of $P(z)$.

Our test is monotone in $z$. To complete our proof, we use binary search to
find the largest $z$ for which the test tells us that $P(z)$ holds. We find
that the test is passed when $z = 0.004735$, but failed when $z = 0.004736$.
Thus, we can state our main result.

Theorem 7. The algorithm given in Section 4.2 finds a
$(\frac{2}{3} - 0.004735)$-WSNE.
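Because the test is monotone in $z$, the search for the threshold is an
ordinary bisection. The sketch below shows only that outer loop. The predicate
`stand_in` is a placeholder that hard-codes the reported threshold: it is not
the test itself, which relies on the linear programs developed in Sections 8
to 13. The function name, the search interval and the tolerance are likewise
illustrative assumptions.

```python
def largest_passing_z(test, lo, hi, tol=1e-6):
    """Binary search for the largest z in [lo, hi] with test(z) true.

    Assumes test is monotone (true up to some threshold, false beyond it),
    that test(lo) is true and that test(hi) is false.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if test(mid):
            lo = mid    # the test passes, so the threshold is at least mid
        else:
            hi = mid    # the test fails, so the threshold is below mid
    return lo

# Stand-in predicate only: it mimics the reported outcome of the real test.
stand_in = lambda z: z <= 0.004735
z = largest_passing_z(stand_in, 0.0, 0.01)
print(round(z, 6))
```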

6 Modifying the KS algorithm 

Our objective is to use the KS algorithm to find a $(\frac{2}{3} - z)$-WSNE for
some constant $z > 0$. However, to do this, we must make some modifications.
The KS algorithm uses a preprocessing step to remove all games in which there
is a pair of pure strategies $(i, j)$ such that $R_{i,j} \ge \frac{1}{3}$ and
$C_{i,j} \ge \frac{1}{3}$. This was a valid step because if such an $(i, j)$
exists, then when both players play these strategies, their regret can be at
most $\frac{2}{3}$, and hence we have a $\frac{2}{3}$-WSNE. However, our
assumption is only that $\epsilon_p > \frac{2}{3} - z$, where $\epsilon_p$ is
the approximation guarantee of the best pure WSNE found by Procedure (1), and
so the original analysis does not hold.

Note that if there is a pure strategy profile $(i, j)$ such that
$R_{i,j} \ge \frac{1}{3} + z$ and $C_{i,j} \ge \frac{1}{3} + z$, then $(i, j)$
is a $(\frac{2}{3} - z)$-WSNE. Therefore, we will use the fact that
$\epsilon_p > \frac{2}{3} - z$ to conclude that there cannot be a pair of pure
strategies with that property. Since all payoffs in $R$ and $C$ lie in the
range $[0, 1]$, this implies, for all $i$ and $j$:

$0 \le R_{i,j} + C_{i,j} \le \frac{4}{3} + z.$    (5)

In the rest of this section, we will carry out an analysis in the context of
this new assumption.

Recall that, in order to find a WSNE, the KS algorithm solved a zero-sum
game $(D, -D)$ where $D = \frac{1}{2}(R - C)$. Suppose that we solve the game,
and that we obtain a mixed strategy profile $(x, y)$. If $(x, y)$ happens to be
a $(\frac{2}{3} - z)$-WSNE, then we can stop, and output $(x, y)$. Otherwise,
at least one of the players has regret larger than $\frac{2}{3} - z$. We will
suppose that this is the row player, and we will provide proofs for this
scenario. However, all of our techniques can be applied symmetrically to the
column player.

Recall the worst-case example that was presented in Figure 1. There we saw
an instance where the row player had regret $\frac{2}{3}$, because there was a
row in the support with payoff 0, and a row outside the support with payoff
$\frac{2}{3}$. We will show that, if the row player has regret larger than
$\frac{2}{3} - z$ in $(x, y)$, then the game must necessarily be similar to the
example of Figure 1. We begin by showing that there must be a row in the
support of $x$ with payoff close to 0.

Proposition 8. If $(x, y)$ is a solution of $(D, -D)$ such that the row player
has regret larger than $\frac{2}{3} - z$ when $(x, y)$ is played in $(R, C)$,
then there is a row $i \in \mathrm{Supp}(x)$ such that both of the following
hold:

$R_i \cdot y < 3z$,    $C_i \cdot y < 3z$.

Proof. We begin by noting that, since $D = \frac{1}{2}(R - C)$, if
$X = -\frac{1}{2}(R + C)$, then we have two equalities:

$R = D - X$,    $C = -D - X$.

Since $x$ is a min-max strategy in $(D, -D)$, if $i$ is a row in Supp(x), then
for all rows $i'$ we have:

$D_i \cdot y \ge D_{i'} \cdot y,$
$(R + X)_i \cdot y \ge (R + X)_{i'} \cdot y,$
$R_i \cdot y \ge R_{i'} \cdot y - (X_i - X_{i'}) \cdot y.$

Since the row player has regret larger than $\frac{2}{3} - z$ when $(x, y)$ is
played in $(R, C)$, there must be a pair of rows $i, i'$ with
$i \in \mathrm{Supp}(x)$ and $i' \notin \mathrm{Supp}(x)$ such that:

$R_{i'} \cdot y - (\frac{2}{3} - z) > R_i \cdot y.$

Hence, we have:

$(X_i - X_{i'}) \cdot y > \frac{2}{3} - z.$

Note that, by Equation (5), all entries of $X$ must lie in the range
$[-\frac{2}{3} - \frac{1}{2}z, 0]$. In particular, this implies that:

$-\frac{2}{3} - \frac{1}{2}z \le X_{i'} \cdot y < X_i \cdot y - (\frac{2}{3} - z).$

This implies that $-\frac{3}{2}z < X_i \cdot y \le 0$. Now, using the
definition of $X$ we obtain:

$-\frac{1}{2}(R + C)_i \cdot y > -\frac{3}{2}z,$

which is equivalent to:

$(R + C)_i \cdot y < 3z.$

Since both $R$ and $C$ are non-negative, we have completed the proof. □

The other feature of the example given in Figure 1 is that there is a row
$i' \notin \mathrm{Supp}(x)$ in which both $R_{i'} \cdot y = \frac{2}{3}$ and
$C_{i'} \cdot y = \frac{2}{3}$. The next proposition shows that, whenever the
algorithm produces a strategy profile that is not a $(\frac{2}{3} - z)$-WSNE,
then such a row must always exist. We do this by showing that
$R_{i'} \cdot y - C_{i'} \cdot y < 3z$ holds for all rows $i'$.

In this proposition we will also show that
$R_{i'} \cdot y < \frac{2}{3} + 2z$ for all rows $i'$. This implies that, with
our modified assumptions, the KS algorithm will compute a
$(\frac{2}{3} + 2z)$-WSNE. In the following sections we will show how this
$(\frac{2}{3} + 2z)$-WSNE can be improved to a $(\frac{2}{3} - z)$-WSNE.

Proposition 9. If $(x, y)$ is a solution of $(D, -D)$ such that the row player
has regret larger than $\frac{2}{3} - z$ when $(x, y)$ is played in $(R, C)$,
then for all rows $i'$ both of the following hold:

$R_{i'} \cdot y < \frac{2}{3} + 2z$,    $R_{i'} \cdot y - C_{i'} \cdot y < 3z$.

Proof. Let $i$ be the row in Supp(x) whose existence is implied by
Proposition 8. This proposition, along with the fact that all entries in $R$
and $C$ are non-negative, implies that:

$0 \le R_i \cdot y < 3z$,    $0 \le C_i \cdot y < 3z$.

By definition we have $D = \frac{1}{2}(R - C)$, and therefore, we have:

$-\frac{3}{2}z < D_i \cdot y < \frac{3}{2}z.$

Now, since $x$ is a min-max strategy for the zero-sum game $(D, -D)$, we must
have, for all rows $i'$:

$D_{i'} \cdot y \le D_i \cdot y < \frac{3}{2}z.$

Thus, we have:

$\frac{1}{2}(R_{i'} - C_{i'}) \cdot y < \frac{3}{2}z.$

Rearranging this yields one of our two conclusions:

$R_{i'} \cdot y < C_{i'} \cdot y + 3z.$    (6)

We can obtain the other conclusion by rearranging Equation (5) as follows:

$C_{i',j} \le \frac{4}{3} + z - R_{i',j}.$

Taking the expectation over $j$ sampled from $y$, Equation (6) then implies
that:

$R_{i'} \cdot y < C_{i'} \cdot y + 3z \le \frac{4}{3} + 4z - R_{i'} \cdot y.$

This implies that $2 \cdot R_{i'} \cdot y < \frac{4}{3} + 4z$, and so we have
$R_{i'} \cdot y < \frac{2}{3} + 2z$. □

Since we do not have a $(\frac{2}{3} - z)$-WSNE, we must have a row $i$ such
that the payoff of the row satisfies $R_i \cdot y > \frac{2}{3} - z$. On the
other hand, Proposition 9, part 1, implies that the payoff of each row must
also satisfy $R_i \cdot y < \frac{2}{3} + 2z$. In order to find a
$(\frac{2}{3} - z)$-WSNE, we must ensure that every row whose payoff lies in
this range is improved. Note that a row whose payoff is $\frac{2}{3} + 2z$ must
be improved more than a row whose payoff is $\frac{2}{3}$, and therefore our
techniques should be able to differentiate between the two. It is for this
reason that we introduce the notion of a $q$-bad row.

Definition 10. A row $i$ is $q$-bad if:

$R_i \cdot y = \frac{2}{3} + 2z - qz.$

Let $i$ be a $q$-bad row. We can apply the second inequality of Proposition 9
to obtain the following:

$C_i \cdot y > \frac{2}{3} - z - qz.$    (7)

This adds further evidence in support of the claim that, whenever the zero-sum
game solution is not a $(\frac{2}{3} - z)$-WSNE, the game must look similar to
the one shown in Figure 1. In that example we have a row $i$ such that
$R_i \cdot y = \frac{2}{3}$ and $C_i \cdot y = \frac{2}{3}$. Here we have shown
that this is a general property: since there must be a $q$-bad row $i$ with
$q < 3$, that row must have $C_i \cdot y > \frac{2}{3} - 4z$.



7 A specific improvement $y^{imp}$

Our approach is to take the strategy $y$ that was found in the previous
section, and to improve it. To do this, we fix $\bar{i}$ to be the index of a
worst bad row. More precisely, let $\bar{i} \in \arg\max_i (R_i \cdot y)$, thus
$\bar{i}$ is a $\bar{q}$-bad row such that there is no $q$-bad row with
$q < \bar{q}$. We fix $\bar{i}$ and $\bar{q}$ to be these choices throughout
the rest of this paper. Since we are assuming that $(x, y)$ is not a
$(\frac{2}{3} - z)$-WSNE, we know that
$\frac{2}{3} + 2z - \bar{q}z > \frac{2}{3} - z$. This implies that
$\bar{q} < 3$.

If we consider the example shown in Figure 1, then we see that the columns
of the first row can be split into two types: columns in which the row player
has a large payoff, and columns in which the column player has a large payoff.
Building on this observation, we split the columns of each row $i$ into three
sets. We define the set $B_i$ of big columns, and the set $S_i$ of small
columns, as follows:

$B_i = \{j : R_{i,j} \ge \frac{2}{3} + 2z\},$

$S_i = \{j : C_{i,j} \ge \frac{2}{3} + 2z\}.$

Finally, we have the set of other columns

$O_i = \{1, 2, \ldots, n\} \setminus (B_i \cup S_i),$

which contains all columns that are neither big nor small.

We aim to make row $\bar{i}$ less attractive to the row player by moving the
probability assigned to the columns in $B_{\bar{i}}$ to the columns in
$S_{\bar{i}}$. This is analogous to shifting probability from the first column
to the second column in Figure 1. Formally, we define the strategy $y^{imp}$,
for each $j$ with $1 \le j \le n$, as:

$y^{imp}_j = 0$    if $j \in B_{\bar{i}}$,

$y^{imp}_j = y_j + \frac{\sum_{k \in B_{\bar{i}}} y_k}{|S_{\bar{i}}|}$    if $j \in S_{\bar{i}}$,

$y^{imp}_j = y_j$    otherwise.

It will certainly be the case that, for the row $\bar{i}$, we will have
$R_{\bar{i}} \cdot y^{imp} < R_{\bar{i}} \cdot y$. However, this may not hold
for the other rows in the game: the payoff of other rows may not decrease as
fast as that of $\bar{i}$, or the payoff may even increase. It is for this
reason that we do not suggest jumping directly to the strategy $y^{imp}$, but
instead we propose that $y$ should be gradually improved towards $y^{imp}$.
More formally, for the parameter $t \in [0, 1]$, we define the strategy $y(t)$
to be a mix of $y$ and $y^{imp}$:

$y(t) := (1 - t) \cdot y + t \cdot y^{imp}.$    (8)

Recall that, in Definition 6, we are interested in values of $t$ such that
$R_i \cdot y(t) \le \frac{2}{3} - z$, for all rows $i$. This means that all
$q$-bad rows with $q < 3$ must improve so that their payoff is below
$\frac{2}{3} - z$, the $q$-bad rows with $q = 3$ may not get worse, and the
$q$-bad rows with $q > 3$ may get worse, but must still remain below
$\frac{2}{3} - z$. In the rest of this paper, we give an algorithm that decides
whether this is the case.
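The construction of this section can be sketched in code. Two points in the
sketch are assumptions rather than facts taken from the text: the thresholds
for big and small columns are taken to be $\frac{2}{3} + 2z$, as used in the
proof of Proposition 11, and the probability removed from the big columns is
spread evenly over the small columns. The function name and the example game
(the hypothetical payoffs used to illustrate Figure 1) are also illustrative.

```python
from fractions import Fraction as F

def improved_strategy(R, C, y, z, t):
    """Return (i_bar, y_imp, y_t) for the worst bad row of R against y."""
    n = len(y)
    payoff = lambda i: sum(R[i][j] * y[j] for j in range(n))
    i_bar = max(range(len(R)), key=payoff)             # a worst bad row
    threshold = F(2, 3) + 2 * z                        # assumed cut-off
    big = [j for j in range(n) if R[i_bar][j] >= threshold]
    small = [j for j in range(n) if C[i_bar][j] >= threshold]
    if not small or set(big) & set(small):
        raise ValueError("need a non-empty small set disjoint from the big set")
    moved = sum(y[j] for j in big)                     # mass taken from big columns
    y_imp = list(y)
    for j in big:
        y_imp[j] = 0
    for j in small:
        y_imp[j] = y[j] + moved / len(small)           # assumed even split
    # Convex combination y(t) = (1 - t) * y + t * y_imp.
    y_t = [(1 - t) * y[j] + t * y_imp[j] for j in range(n)]
    return i_bar, y_imp, y_t

# Hypothetical game matching the description of Figure 1 (rows T, B; columns l, r).
R = [[F(1, 3), F(1)], [F(0), F(0)]]
C = [[F(1), F(1, 3)], [F(0), F(0)]]
y = [F(1, 2), F(1, 2)]
i_bar, y_imp, y_t = improved_strategy(R, C, y, z=F(1, 200), t=F(1, 2))
print(i_bar, y_imp, y_t)
```

In this instance the big column is $r$ and the small column is $\ell$, so
$y^{imp}$ places all probability on $\ell$ and $y(\frac{1}{2})$ sits halfway
between the two strategies.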

8 The structure of a q-bad row 

We begin our proof by studying the structure of each $q$-bad row $i$. In
particular, we want to show bounds on the amount of probability that $y$ can
assign to $B_i$, $S_i$, and $O_i$.

We begin by considering the columns in $O_i$. The first thing that we note is
that, if a column $j$ is in $O_i$, then $R_{i,j} + C_{i,j}$ must be
significantly smaller than $\frac{4}{3} + z$.

Proposition 11. For each row $i$, and each column $j \in O_i$, we have
$R_{i,j} + C_{i,j} \le 1 + 3z$.

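Proof. For each column $j \in O_i$ we have both of the following properties:

- Since $j \notin B_i$, we have $R_{i,j} < \frac{2}{3} + 2z$.

- Since $j \notin S_i$, we have $C_{i,j} < \frac{2}{3} + 2z$.

Furthermore, our assumption that Procedure (1) does not find a pure
$(\frac{2}{3} - z)$-WSNE implies that:

- If $R_{i,j} \ge \frac{1}{3} + z$, then $C_{i,j} < \frac{1}{3} + z$.

- If $C_{i,j} \ge \frac{1}{3} + z$, then $R_{i,j} < \frac{1}{3} + z$.

This is because, if these inequalities did not hold for some pair $i$ and $j$,
then it is easy to show that $(i, j)$ is a $(\frac{2}{3} - z)$-WSNE. It follows
that at most one of the two payoffs can reach $\frac{1}{3} + z$, and that one
is still below $\frac{2}{3} + 2z$. From these properties it is easy to see that
$R_{i,j} + C_{i,j} < \frac{2}{3} + 2z + \frac{1}{3} + z = 1 + 3z$. □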

We now want to argue that, for a $q$-bad row $i$, the strategy $y$ does not
assign much probability to $O_i$. Recall that Definition 10 and Equation (7)
give the following two facts:

$R_i \cdot y = \frac{2}{3} + 2z - qz,$

$C_i \cdot y > \frac{2}{3} - z - qz.$

Moreover, our assumptions ensure that there is never a pair $i$ and $j$ such
that $R_{i,j} \ge \frac{1}{3} + z$ and $C_{i,j} \ge \frac{1}{3} + z$.
Therefore, the only possible way to achieve an average of around $\frac{2}{3}$
for both $R_i \cdot y$ and $C_i \cdot y$ is for our game to resemble the
example shown in Figure 1: around half of the probability mass of $y$ must be
placed on columns $j$ where $R_{i,j} \approx 1$ and
$C_{i,j} \approx \frac{1}{3}$, and around half of the probability mass of $y$
must be placed on columns $j$ where $R_{i,j} \approx \frac{1}{3}$ and
$C_{i,j} \approx 1$.

Proposition 11 implies that it is impossible for a column $j$ in $O_i$ to have
either of these properties: if $R_{i,j} = 1$ then $C_{i,j}$ must be
significantly smaller than $\frac{1}{3}$, for example. This means that the
amount of probability mass that $y$ assigns to $O_i$ must be very limited. The
next proposition applies Markov's inequality to prove this fact.

Proposition 12. If $i$ is a $q$-bad row, then
$\sum_{j \in O_i} y_j \le \frac{2qz}{\frac{1}{3} - 2z}$.

Proof. Consider the random variable
$T = \frac{4}{3} + z - R_{i,j} - C_{i,j}$, where $i$ is fixed and $j$ is
sampled from $y$. From Equation (5), we have that $T$ takes values in the range
$[0, \frac{4}{3} + z]$. Combining Definition 10 with Equation (7) gives the
following:

$R_i \cdot y + C_i \cdot y > \frac{4}{3} + (1 - 2q)z.$

Therefore, we have the following expression for the expectation of $T$:

$E[T] = \frac{4}{3} + z - E_{j \sim y}[R_{i,j} + C_{i,j}]$
$< \frac{4}{3} + z - (\frac{4}{3} + (1 - 2q)z) = 2qz.$

By Proposition 11, for each $j \in O_i$, we have
$R_{i,j} + C_{i,j} \le 1 + 3z$. Hence, we have
$T \ge \frac{4}{3} + z - (1 + 3z) = \frac{1}{3} - 2z$ for each $j \in O_i$.
Therefore, we must have
$\Pr(T \ge \frac{1}{3} - 2z) \ge \sum_{j \in O_i} y_j$. Applying Markov's
inequality completes the proof:

$\Pr(T \ge \frac{1}{3} - 2z) \le \frac{E[T]}{\frac{1}{3} - 2z} \le \frac{2qz}{\frac{1}{3} - 2z}.$

□



Proposition 12 shows that, if a row i is 0-bad, then y cannot assign any 
probability at all to Oi. As would be expected, as the value of q increases, the 
amount of probability that can be assigned to Oi increases. 
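
To get a feel for how quickly the bound of Proposition 12 loosens as $q$ grows, the following minimal sketch evaluates it with exact rational arithmetic. The function name `max_mass_on_O` and the cap at 1 (a probability mass can never exceed 1) are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction

def max_mass_on_O(q, z):
    """Upper bound 2qz / (1/3 - 2z) on the mass y gives to O_i (Proposition 12)."""
    q, z = Fraction(q), Fraction(z)
    slack = Fraction(1, 3) - 2 * z
    if slack <= 0:
        raise ValueError("the bound is only meaningful when 1/3 - 2z > 0")
    # Assumption for illustration: cap at 1, since the quantity is a probability.
    return min(Fraction(1), 2 * q * z / slack)

# A 0-bad row places no mass on O_i; the bound grows linearly in q.
for q in (0, 1, 2):
    print(q, max_mass_on_O(q, Fraction(1, 30)))
```

With the illustrative value $z = 1/30$, the bound is $0$ for $q = 0$ and $1/4$ for $q = 1$.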

We now prove the second assertion: that the split between Bi and Si should 
be roughly equal. The following two propositions provide a lower bound on the 
amount of probability mass that $y$ can assign to the columns in $B_i$ and $S_i$,
respectively. 

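Proposition 13. If $i$ is a q-bad row, then

$$\sum_{j \in B_i} y_j \ge \frac{\frac{1}{3} + z - qz - (\frac{1}{3} + z) \sum_{j \in O_i} y_j}{\frac{2}{3} - z}.$$

Proof. Since the sets $B_i$, $S_i$, and $O_i$ are disjoint, we can write Definition 10 as:

$$\sum_{j \in B_i} y_j \cdot R_{i,j} + \sum_{j \in S_i} y_j \cdot R_{i,j} + \sum_{j \in O_i} y_j \cdot R_{i,j} = \frac{2}{3} + 2z - qz.$$

We know that $R_{i,j} \le 1$ for each $j \in B_i$, that $R_{i,j} \le \frac{2}{3} + 2z$ for each $j \in O_i$, and
that $R_{i,j} \le \frac{1}{3} + z$ for each $j \in S_i$. Therefore we obtain the following inequality:

$$1 \cdot \sum_{j \in B_i} y_j + \left(\frac{1}{3} + z\right) \cdot \sum_{j \in S_i} y_j + \left(\frac{2}{3} + 2z\right) \cdot \sum_{j \in O_i} y_j \ge \frac{2}{3} + 2z - qz.$$

Furthermore, since $\sum_{j \in S_i} y_j = 1 - \sum_{j \in B_i} y_j - \sum_{j \in O_i} y_j$, we have:

$$\sum_{j \in B_i} y_j + \left(\frac{1}{3} + z\right) \cdot \left(1 - \sum_{j \in B_i} y_j - \sum_{j \in O_i} y_j\right) + \left(\frac{2}{3} + 2z\right) \sum_{j \in O_i} y_j \ge \frac{2}{3} + 2z - qz.$$

Rearranging this gives:

$$\left(\frac{2}{3} - z\right) \sum_{j \in B_i} y_j \ge \frac{1}{3} + z - qz - \left(\frac{1}{3} + z\right) \sum_{j \in O_i} y_j.$$

Finally, this allows us to conclude that:

$$\sum_{j \in B_i} y_j \ge \frac{\frac{1}{3} + z - qz - (\frac{1}{3} + z) \sum_{j \in O_i} y_j}{\frac{2}{3} - z}.$$

□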



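Proposition 14. If $i$ is a q-bad row then

$$\sum_{j \in S_i} y_j \ge \frac{\frac{1}{3} - 2z - qz - (\frac{1}{3} + z) \sum_{j \in O_i} y_j}{\frac{2}{3} - z}.$$

Proof. Since the sets $B_i$, $S_i$, and $O_i$ are disjoint, we can rewrite Equation (7) as:

$$\sum_{j \in B_i} y_j \cdot C_{i,j} + \sum_{j \in S_i} y_j \cdot C_{i,j} + \sum_{j \in O_i} y_j \cdot C_{i,j} \ge \frac{2}{3} - z - qz.$$

We know that $C_{i,j} \le 1$ for each $j \in S_i$, that $C_{i,j} \le \frac{2}{3} + 2z$ for each $j \in O_i$, and
that $C_{i,j} \le \frac{1}{3} + z$ for each $j \in B_i$. Therefore we obtain the following inequality:

$$1 \cdot \sum_{j \in S_i} y_j + \left(\frac{1}{3} + z\right) \cdot \sum_{j \in B_i} y_j + \left(\frac{2}{3} + 2z\right) \cdot \sum_{j \in O_i} y_j \ge \frac{2}{3} - z - qz.$$

Furthermore, since $\sum_{j \in B_i} y_j = 1 - \sum_{j \in S_i} y_j - \sum_{j \in O_i} y_j$, we have:

$$\sum_{j \in S_i} y_j + \left(\frac{1}{3} + z\right) \cdot \left(1 - \sum_{j \in S_i} y_j - \sum_{j \in O_i} y_j\right) + \left(\frac{2}{3} + 2z\right) \sum_{j \in O_i} y_j \ge \frac{2}{3} - z - qz.$$

Rearranging this gives:

$$\left(\frac{2}{3} - z\right) \sum_{j \in S_i} y_j \ge \frac{1}{3} - 2z - qz - \left(\frac{1}{3} + z\right) \sum_{j \in O_i} y_j.$$

Finally, this allows us to conclude that:

$$\sum_{j \in S_i} y_j \ge \frac{\frac{1}{3} - 2z - qz - (\frac{1}{3} + z) \sum_{j \in O_i} y_j}{\frac{2}{3} - z}.$$

□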



Recall that, for a 0-bad row $i$, we will have $\sum_{j \in O_i} y_j = 0$. Therefore, the
inequalities given by these propositions imply that approximately $\frac{1}{2}$ of the
probability mass of $y$ must be placed on $B_i$ and approximately $\frac{1}{2}$ of the probability
mass of $y$ must be placed on $S_i$. Again, as $q$ increases, this bound gets
progressively worse.
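
The two lower bounds of Propositions 13 and 14 can be evaluated side by side. The sketch below is only an illustration: the function name `mass_lower_bounds` and the sample values of $z$, $q$, and the mass on $O_i$ are assumptions chosen for the example.

```python
from fractions import Fraction

def mass_lower_bounds(q, z, mass_O):
    """Lower bounds on the mass y gives to B_i and S_i (Propositions 13 and 14)."""
    q, z, mass_O = Fraction(q), Fraction(z), Fraction(mass_O)
    third = Fraction(1, 3)
    denom = 2 * third - z                # 2/3 - z
    common = (third + z) * mass_O        # (1/3 + z) * sum_{j in O_i} y_j
    lower_B = (third + z - q * z - common) / denom
    lower_S = (third - 2 * z - q * z - common) / denom
    return lower_B, lower_S

# For a 0-bad row (q = 0, no mass on O_i) the split is close to one half each.
print(mass_lower_bounds(0, Fraction(1, 30), 0))
# For larger q the guarantees weaken.
print(mass_lower_bounds(1, Fraction(1, 30), Fraction(1, 4)))
```

For $q = 0$ and no mass on $O_i$, the two numerators add up to $\frac{2}{3} - z$, so the two lower bounds sum to exactly 1 and therefore determine the split between $B_i$ and $S_i$.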
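9 An upper bound on $R_i \cdot y^{\mathrm{imp}}$ for a row $i$

In this section, we prove an upper bound on $R_i \cdot y^{\mathrm{imp}}$, for a given row $i$. We
begin by decomposing the expression based on the columns of row $\bar{\imath}$.

$$R_i \cdot y^{\mathrm{imp}} = \sum_{j \in B_{\bar{\imath}}} R_{i,j} \cdot y^{\mathrm{imp}}_j + \sum_{j \in S_{\bar{\imath}}} R_{i,j} \cdot y^{\mathrm{imp}}_j + \sum_{j \in O_{\bar{\imath}}} R_{i,j} \cdot y^{\mathrm{imp}}_j.$$

Recall that, when constructing $y^{\mathrm{imp}}$, we moved all probability from the
columns in $B_{\bar{\imath}}$ to the columns in $S_{\bar{\imath}}$. Therefore, by definition, for each column
$j \in B_{\bar{\imath}}$, we have that $y^{\mathrm{imp}}_j = 0$. Moreover, we did not modify the probability
assigned to the columns in $O_{\bar{\imath}}$. Therefore, for each column $j \in O_{\bar{\imath}}$, we have $y^{\mathrm{imp}}_j = y_j$.
Thus, we can rewrite our expression for $R_i \cdot y^{\mathrm{imp}}$ as follows:

$$R_i \cdot y^{\mathrm{imp}} = \sum_{j \in S_{\bar{\imath}}} R_{i,j} \cdot y^{\mathrm{imp}}_j + \sum_{j \in O_{\bar{\imath}}} R_{i,j} \cdot y_j. \qquad (9)$$

Our goal is for our final bound to only rely on $y$, and not $y^{\mathrm{imp}}$. Therefore,
for the columns $j \in S_{\bar{\imath}}$, we need an upper bound for $y^{\mathrm{imp}}_j$ in terms of $y_j$. The
next proposition gives such a bound.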

Definition 15. Let 

Proposition 16. For all $j \in S_{\bar{\imath}}$ we have:

$$y^{\mathrm{imp}}_j \le \phi(z, \bar{q}) \cdot y_j.$$

Proof. By definition we have, for each j G Sj\ 



y" n " = yj- v v 



y? ' Sjes 5 y? 



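Proposition 14 implies that $\sum_{j \in S_{\bar{\imath}}} y_j \ge \frac{\frac{1}{3} - 2z - \bar{q}z - (\frac{1}{3} + z) \sum_{j \in O_{\bar{\imath}}} y_j}{\frac{2}{3} - z}$. This allows us
to conclude that: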



E y j ; = 1 - E y^- E y j 

. , S - 2z - Q z - (I + z ) EjeO, y^ 

^ 1 t— y^ 

2 

| - z - \ + 2z + qz + E j£0i y^ 



3 Z 



i + z + + E jeo , yj 
Hence, we can use Proposition 12 to conclude:



-2z-qz-(- s +z) T, je o, yj 



\ + z + qz + 4^ 



-2z 

\-2z-qz-{\ + zj^_ 2: 



= <l>(z,q) -y r 



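Using Proposition 16, we can rewrite Equation (9) to obtain the following
upper bound:

$$R_i \cdot y^{\mathrm{imp}} \le \phi(z, \bar{q}) \cdot \sum_{j \in S_{\bar{\imath}}} R_{i,j} \cdot y_j + \sum_{j \in O_{\bar{\imath}}} R_{i,j} \cdot y_j.$$

Recall that $B_i$, $S_i$, and $O_i$ are a partition of the columns in row $i$. Therefore,
for the columns in $S_{\bar{\imath}}$, we have the following equality:

$$\sum_{j \in S_{\bar{\imath}}} R_{i,j} \cdot y_j = \sum_{j \in S_{\bar{\imath}} \cap B_i} R_{i,j} \cdot y_j + \sum_{j \in S_{\bar{\imath}} \cap S_i} R_{i,j} \cdot y_j + \sum_{j \in S_{\bar{\imath}} \cap O_i} R_{i,j} \cdot y_j.$$

By definition, we have $R_{i,j} \le 1$ for each column $j \in B_i$, we have $R_{i,j} \le \frac{1}{3} + z$
for each column $j \in S_i$, and we have $R_{i,j} \le \frac{2}{3} + 2z$ for each column $j \in O_i$.
Therefore, we can rewrite our upper bound as:

$$\sum_{j \in S_{\bar{\imath}}} R_{i,j} \cdot y_j \le \sum_{j \in S_{\bar{\imath}} \cap B_i} y_j + \sum_{j \in S_{\bar{\imath}} \cap S_i} \left(\frac{1}{3} + z\right) \cdot y_j + \sum_{j \in S_{\bar{\imath}} \cap O_i} \left(\frac{2}{3} + 2z\right) \cdot y_j.$$

We can perform the same procedure for the columns in $O_{\bar{\imath}}$ to obtain our final
bound:

Proposition 17. For every row $i$, we have:

$$R_i \cdot y^{\mathrm{imp}} \le \phi(z, \bar{q}) \left( \sum_{j \in S_{\bar{\imath}} \cap B_i} y_j + \sum_{j \in S_{\bar{\imath}} \cap S_i} \left(\frac{1}{3} + z\right) \cdot y_j + \sum_{j \in S_{\bar{\imath}} \cap O_i} \left(\frac{2}{3} + 2z\right) \cdot y_j \right)$$
$$\qquad + \sum_{j \in O_{\bar{\imath}} \cap B_i} y_j + \sum_{j \in O_{\bar{\imath}} \cap S_i} \left(\frac{1}{3} + z\right) \cdot y_j + \sum_{j \in O_{\bar{\imath}} \cap O_i} \left(\frac{2}{3} + 2z\right) \cdot y_j.$$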
10 An upper bound on $R_i \cdot y^{\mathrm{imp}}$ for all q-bad rows $i$

In the previous section, we showed an upper bound on $R_i \cdot y^{\mathrm{imp}}$ for a specific
row $i$. In this section, we will show, for a fixed $q$, a bound for all q-bad rows.

In the previous section, our upper bound depended on the amount of
probability that $y$ gives to the columns in row $i$: the upper bound only depended on
the partition $(B_{\bar{\imath}}, S_{\bar{\imath}}, O_{\bar{\imath}})$ of the columns in row $\bar{\imath}$, and the partition $(B_i, S_i, O_i)$
of the columns in row $i$. In particular, the upper bound used the intersections of
these partitions. The following diagram shows the decomposition of a row $i$ into
nine possible intersections:



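[Figure: row $\bar{\imath}$ is divided into the blocks $B_{\bar{\imath}}$, $S_{\bar{\imath}}$, and $O_{\bar{\imath}}$; within each of these blocks, row $i$ is divided into $B_i$, $S_i$, and $O_i$, giving nine intersections.]
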

When we consider all possible q-bad rows, we cannot know the precise amount
of probability that $y$ assigns to each of the sets in the decomposition. However,
recall that in Section 8, we proved inequalities on the sets used in the
decomposition. Thus, we can use these inequalities to write down a linear program that
characterises all possible q-bad rows.

Our LP will have one variable for each of the sets in the decomposition. The
idea is that each variable should represent the amount of probability that $y$
assigns to that set. Thus, we have nine variables: $d_{bb}$, $d_{bs}$, $d_{bo}$, $d_{sb}$, $d_{ss}$, $d_{so}$, $d_{ob}$,
$d_{os}$, and $d_{oo}$, where the variable $d_{bb}$ represents $\sum_{j \in B_{\bar{\imath}} \cap B_i} y_j$, the variable $d_{bs}$
represents $\sum_{j \in B_{\bar{\imath}} \cap S_i} y_j$, and so on. For the sake of convenience, we use $\sum d_{b*}$ as
a shorthand for $d_{bb} + d_{bs} + d_{bo}$, and $\sum d_{*b}$ as a shorthand for $d_{bb} + d_{sb} + d_{ob}$. We also
use $\sum d_{s*}$, $\sum d_{*s}$, $\sum d_{o*}$, and $\sum d_{*o}$, which have analogous definitions. Finally,
we use $\sum d_{**}$ as a shorthand for $\sum d_{b*} + \sum d_{s*} + \sum d_{o*}$.
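
As an illustration of the decomposition, the following sketch computes the nine variables and the shorthand sums from a strategy $y$ and the two partitions. The encoding of each partition as a list of labels `'b'`, `'s'`, `'o'` (one per column) and all function names are assumptions made for this example.

```python
from fractions import Fraction

def decompose(y, part_bar, part_i):
    """Return d with keys 'bb', 'bs', ..., 'oo'.

    The first letter is the block of the column in row i-bar, the second
    letter is its block in row i; the value is the mass y places there.
    """
    d = {a + c: Fraction(0) for a in "bso" for c in "bso"}
    for prob, a, c in zip(y, part_bar, part_i):
        d[a + c] += prob
    return d

def sum_first(d, a):
    """Shorthand sum d_{a*}: fix the block in row i-bar."""
    return sum(d[a + c] for c in "bso")

def sum_second(d, c):
    """Shorthand sum d_{*c}: fix the block in row i."""
    return sum(d[a + c] for a in "bso")

# Hypothetical three-column example.
y = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
d = decompose(y, part_bar=["b", "s", "s"], part_i=["s", "b", "o"])
print(d["bs"], d["sb"], d["so"], sum_first(d, "s"), sum_second(d, "o"))
```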

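The LP takes three constants: $z$, $\bar{q}$, and $q$. The inequalities of this LP are
taken directly from Section 8, and each inequality appears twice, once for the
row $\bar{\imath}$, and once for the row $i$. Thus, we have the following LP:

Maximize:

$$\phi(z, \bar{q}) \left( d_{sb} + \left(\frac{1}{3} + z\right) \cdot d_{ss} + \left(\frac{2}{3} + 2z\right) \cdot d_{so} \right) + d_{ob} + \left(\frac{1}{3} + z\right) \cdot d_{os} + \left(\frac{2}{3} + 2z\right) \cdot d_{oo}$$

Subject to:

$$\sum d_{b*} \ge \frac{\frac{1}{3} + z - \bar{q}z - (\frac{1}{3} + z)(\sum d_{o*})}{\frac{2}{3} - z} \qquad (10)$$

$$\sum d_{*b} \ge \frac{\frac{1}{3} + z - qz - (\frac{1}{3} + z)(\sum d_{*o})}{\frac{2}{3} - z} \qquad (11)$$

$$\sum d_{s*} \ge \frac{\frac{1}{3} - 2z - \bar{q}z - (\frac{1}{3} + z)(\sum d_{o*})}{\frac{2}{3} - z} \qquad (12)$$

$$\sum d_{*s} \ge \frac{\frac{1}{3} - 2z - qz - (\frac{1}{3} + z)(\sum d_{*o})}{\frac{2}{3} - z} \qquad (13)$$

$$\sum d_{o*} \le \frac{2\bar{q}z}{\frac{1}{3} - 2z} \qquad (14)$$

$$\sum d_{*o} \le \frac{2qz}{\frac{1}{3} - 2z} \qquad (15)$$

$$\sum d_{**} = 1 \qquad (16)$$

$$d \ge 0 \qquad (17)$$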

We will denote the feasible region of this LP as $F(z, \bar{q}, q)$. We say that row
$i$ is feasible if the decomposition of row $i$ is in the feasible region of the LP.
More formally, let $d^i$ be a vector, where we set $d^i_{bb} = \sum_{j \in B_{\bar{\imath}} \cap B_i} y_j$,
$d^i_{bs} = \sum_{j \in B_{\bar{\imath}} \cap S_i} y_j$, and so on. Row $i$ is feasible in the LP if and only if
$d^i \in F(z, \bar{q}, q)$. Since the inequalities in this LP come directly from Propositions 12, 13,
and 14, we have that all q-bad rows are feasible in the LP.
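
Membership in the feasible region can be checked directly from Constraints (10) through (17). The sketch below is a plain feasibility checker, not an LP solver; the dictionary encoding of $d$ (keys `'bb'` through `'oo'`, first letter for row $\bar{\imath}$) and the name `lp_violations` are assumptions for illustration, and it presumes $\frac{1}{3} - 2z > 0$.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

def lp_violations(d, z, q_bar, q):
    """Return the numbers of the constraints among (10)-(17) that d violates."""
    z, q_bar, q = Fraction(z), Fraction(q_bar), Fraction(q)
    first = lambda a: sum(d[a + c] for c in "bso")    # sum d_{a*}
    second = lambda c: sum(d[a + c] for a in "bso")   # sum d_{*c}
    denom = 2 * THIRD - z

    def lower(const, badness, mass_o):
        # Shared shape of the right-hand sides of (10)-(13).
        return (const - badness * z - (THIRD + z) * mass_o) / denom

    checks = {
        10: first("b") >= lower(THIRD + z, q_bar, first("o")),
        11: second("b") >= lower(THIRD + z, q, second("o")),
        12: first("s") >= lower(THIRD - 2 * z, q_bar, first("o")),
        13: second("s") >= lower(THIRD - 2 * z, q, second("o")),
        14: first("o") <= 2 * q_bar * z / (THIRD - 2 * z),
        15: second("o") <= 2 * q * z / (THIRD - 2 * z),
        16: sum(d.values()) == 1,
        17: all(v >= 0 for v in d.values()),
    }
    return [number for number, ok in checks.items() if not ok]

# Hypothetical decomposition for z = 1/30 and q_bar = q = 0.
d = {a + c: Fraction(0) for a in "bso" for c in "bso"}
d["bb"], d["ss"] = Fraction(11, 19), Fraction(8, 19)
print(lp_violations(d, Fraction(1, 30), 0, 0))
```

An empty list means that the vector lies in $F(z, \bar{q}, q)$. Solving the LP itself additionally requires the function $\phi$ from Definition 15 and an LP solver.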

We denote the optimal value of the LP as $s(z, \bar{q}, q)$. The objective function
of the LP is the upper bound on $R_i \cdot y^{\mathrm{imp}}$ that was given in Proposition 17.
Since all q-bad rows are feasible in the LP, and since we maximize the objective
function, the solution to this LP must give an upper bound on $R_i \cdot y^{\mathrm{imp}}$ for all
q-bad rows $i$. Thus, we have the following proposition.

Proposition 18. For every q-bad row $i$ we have $R_i \cdot y^{\mathrm{imp}} \le s(z, \bar{q}, q)$.



11 The matching pennies argument 

So far, we have not used the matching pennies argument. Recall that Procedure 
(2) ensures that our improvement procedure is only required to work in the 
case where there is no 2 x 2 sub-game that resembles matching pennies. In this 
section, we show how this argument can be applied to refine the linear program
that we introduced in Section 10.

We begin by formally defining a matching pennies sub-game in terms of our
decomposition.

Definition 19 (Matching Pennies). Let $i$ and $i'$ be two rows, and let $j$ and
$j'$ be two columns. If $j \in B_i \cap S_{i'}$ and $j' \in S_i \cap B_{i'}$, then we say that $i$, $i'$, $j$, and
$j'$ form a matching pennies sub-game.

An example of a matching pennies sub-game can be seen below. 

            j            j'
   i     (1, 1/3)     (1/3, 1)
   i'    (1/3, 1)     (1, 1/3)

Each cell lists the payoff to the row player (I) followed by the payoff to the column player (II).

As we can see, in this example we have $j \in B_i \cap S_{i'}$, and we have $j' \in B_{i'} \cap S_i$, and
therefore this is a matching pennies sub-game. If our game contains this example
as a sub-game, then we can obtain a $\frac{1}{3}$-WSNE by making the row player mix
uniformly between $i$ and $i'$, and making the column player mix uniformly between
$j$ and $j'$.

In the next proposition, we generalise this property: if the row and column
players both mix uniformly over a matching pennies sub-game, then we can
always produce a $(\frac{2}{3} - z)$-WSNE.

Proposition 20. If there is a matching pennies sub-game, then we can construct
a $(\frac{2}{3} - z)$-WSNE.

Proof. Let $i$, $i'$, $j$, and $j'$ be the matching pennies sub-game. We define two
strategies $x'$ and $y'$ as follows:

$$x'_k = \begin{cases} 0.5 & \text{if } k = i \text{ or } k = i', \\ 0 & \text{otherwise,} \end{cases} \qquad y'_k = \begin{cases} 0.5 & \text{if } k = j \text{ or } k = j', \\ 0 & \text{otherwise.} \end{cases}$$

We will prove that $(x', y')$ is a $(\frac{2}{3} - z)$-WSNE. Note that when the column player
plays $y'$, the payoff to the row player for row $i$ is:

$$R_i \cdot y' = 0.5 \cdot R_{i,j} + 0.5 \cdot R_{i,j'}.$$

Since $j \in B_i$ we have $R_{i,j} \ge \frac{2}{3} + 2z$. Hence, we have:

$$R_i \cdot y' \ge 0.5 \times \left(\frac{2}{3} + 2z\right) + 0.5 \times 0 = \frac{1}{3} + z.$$

An identical argument can be used to argue that $R_{i'} \cdot y'$, $C^T_j \cdot x'$, and $C^T_{j'} \cdot x'$
are all greater than or equal to $\frac{1}{3} + z$.
Since $R_k \cdot y' \le 1$ and $C^T_k \cdot x' \le 1$ for all $k$, the largest possible regret that can
be experienced by either of the two players is $1 - (\frac{1}{3} + z) = \frac{2}{3} - z$. Hence, $(x', y')$
is a $(\frac{2}{3} - z)$-WSNE. □
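
The argument can be checked numerically on a small game. The sketch below computes the largest regret of any strategy in the two supports when both players mix uniformly; the function name and the extra third row (added so that a better response exists outside the sub-game) are assumptions for illustration.

```python
from fractions import Fraction

def uniform_mix_regret(R, C, i, i2, j, j2):
    """Largest regret of a supported strategy when the row player mixes
    50/50 over rows i, i2 and the column player mixes 50/50 over columns j, j2."""
    half = Fraction(1, 2)
    n_rows, n_cols = len(R), len(R[0])
    # Payoff of every row against y', and of every column against x'.
    row_pay = [half * R[k][j] + half * R[k][j2] for k in range(n_rows)]
    col_pay = [half * C[i][k] + half * C[i2][k] for k in range(n_cols)]
    row_regret = max(row_pay) - min(row_pay[i], row_pay[i2])
    col_regret = max(col_pay) - min(col_pay[j], col_pay[j2])
    return max(row_regret, col_regret)

t = Fraction(1, 3)
# Rows 0 and 1 with columns 0 and 1 form the example sub-game; row 2 is a
# hypothetical extra row that pays the row player 1 in every column.
R = [[1, t], [t, 1], [1, 1]]
C = [[t, 1], [1, t], [0, 0]]
print(uniform_mix_regret(R, C, 0, 1, 0, 1))
```

In this instance both supported rows earn $\frac{2}{3}$ while the extra row earns 1, so the uniform mix is a $\frac{1}{3}$-WSNE and no better, which matches the worst case allowed by the proof when $z = 0$.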

Due to Proposition 20, we can assume that our game does not contain a
matching pennies sub-game, because otherwise Procedure (2) would have found
a $(\frac{2}{3} - z)$-WSNE. Note that, by definition, if the game does not contain a matching
pennies sub-game, then for all rows $i$ we must have either $B_{\bar{\imath}} \cap S_i = \emptyset$, or
$B_i \cap S_{\bar{\imath}} = \emptyset$. If this were not the case, then we could select a column $j \in B_{\bar{\imath}} \cap S_i$,
and a column $j' \in B_i \cap S_{\bar{\imath}}$, which would give a matching pennies sub-game.
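
Definition 19 translates directly into a search over pairs of rows. The sketch below assumes the sets $B_i$ and $S_i$ have already been computed and are passed in as lists of Python sets of column indices; the function name is hypothetical.

```python
from itertools import permutations

def find_matching_pennies(B, S):
    """Return (i, i2, j, j2) with j in B[i] & S[i2] and j2 in B[i2] & S[i],
    or None if no pair of rows forms a matching pennies sub-game."""
    for i, i2 in permutations(range(len(B)), 2):
        first = B[i] & S[i2]
        second = B[i2] & S[i]
        if first and second:
            # Any element of each intersection works; pick the smallest.
            return i, i2, min(first), min(second)
    return None

# Two rows whose B and S sets cross over: a matching pennies sub-game.
print(find_matching_pennies([{0}, {1}], [{1}, {0}]))
# Two rows whose B sets coincide: no such sub-game.
print(find_matching_pennies([{0}, {0}], [{1}, {1}]))
```

When the function returns `None`, every pair of rows has at least one of the two intersections empty, which is exactly the property used to strengthen the LP.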

We can use this fact to strengthen the linear program given in Section 10. 

Definition 21. We define two LPs by adding an extra constraint to our existing
LP. In the first LP we add the constraint $d_{bs} = 0$, and in the second LP we add
the constraint $d_{sb} = 0$. We refer to these two LPs as $\mathcal{P}_1(z, \bar{q}, q)$ and $\mathcal{P}_2(z, \bar{q}, q)$
respectively.

We will use $F_1(z, \bar{q}, q)$ to refer to the feasible region of $\mathcal{P}_1(z, \bar{q}, q)$, and $F_2(z, \bar{q}, q)$
to refer to the feasible region of $\mathcal{P}_2(z, \bar{q}, q)$. Similarly, we will use $s_1(z, \bar{q}, q)$ and
$s_2(z, \bar{q}, q)$ to refer to the optimal values of the two LPs, respectively.

Since every row $i$ must have either $B_{\bar{\imath}} \cap S_i = \emptyset$, or $B_i \cap S_{\bar{\imath}} = \emptyset$, every row
must be feasible in one of the two LPs. This implies the following strengthening
of Proposition 18.

Proposition 22. If the game has no matching pennies sub-game, then for each
q-bad row $i$ we either have $R_i \cdot y^{\mathrm{imp}} \le s_1(z, \bar{q}, q)$, or we have $R_i \cdot y^{\mathrm{imp}} \le s_2(z, \bar{q}, q)$.

12 An upper bound on $R_i \cdot y^{\mathrm{imp}}$ for all rows

So far, we have shown a bound on $R_i \cdot y^{\mathrm{imp}}$ for all q-bad rows, and this bound
was given by the solution of the two linear programs given in Section 11. In this
section, our goal is to combine these bounds into a simple linear function. In
particular, we give a method for finding two constants $c_z$ and $d_z$, such that

$$\max(s_1(z, \bar{q}, q), s_2(z, \bar{q}, q)) \le c_z + d_z \cdot q,$$

for all possible q. 

Recall that Proposition 22 implies that, if there is no matching pennies
sub-game, then for every q-bad row $i$, we either have $R_i \cdot y^{\mathrm{imp}} \le s_1(z, \bar{q}, q)$, or we
have $R_i \cdot y^{\mathrm{imp}} \le s_2(z, \bar{q}, q)$. Therefore, by showing the above bound, we will have,
for every q-bad row $i$:

$$R_i \cdot y^{\mathrm{imp}} \le c_z + d_z \cdot q.$$

Our first step is to show that both $s_1(z, \bar{q}, q)$ and $s_2(z, \bar{q}, q)$ are monotonically
increasing with respect to $\bar{q}$. The next proposition establishes this fact.






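Proposition 23. Suppose that z < g. If $\bar{q}_1 \le \bar{q}_2$, then, for all $q$, we have both:

$$s_1(z, \bar{q}_1, q) \le s_1(z, \bar{q}_2, q),$$
$$s_2(z, \bar{q}_1, q) \le s_2(z, \bar{q}_2, q).$$

Proof. Let $k \in \{1, 2\}$. We begin by arguing that $F_k(z, \bar{q}_1, q) \subseteq F_k(z, \bar{q}_2, q)$. We
claim that this can be seen by inspection. For example, we can rewrite the first
constraint as:

$$d_{bb} + d_{bs} + d_{bo} + \frac{(\frac{1}{3} + z)(d_{ob} + d_{os} + d_{oo})}{\frac{2}{3} - z} \ge \frac{\frac{1}{3} + z - \bar{q}z}{\frac{2}{3} - z}.$$

Clearly, increasing $\bar{q}$ will make the right-hand side of this constraint weaker. It is
not difficult to perform the same procedure for all other constraints that involve
$\bar{q}$. Hence, since $\bar{q}_1 \le \bar{q}_2$, we must have $F_k(z, \bar{q}_1, q) \subseteq F_k(z, \bar{q}_2, q)$.

Next, we argue that $s_k(z, \bar{q}_1, q) \le s_k(z, \bar{q}_2, q)$. Let $\mathrm{obj}^{z,\bar{q}}$ be the objective
function of $\mathcal{P}_k$:

$$\mathrm{obj}^{z,\bar{q}}(d) = \phi(z, \bar{q}) \left( \sum_{j \in S_{\bar{\imath}} \cap B_i} y_j + \sum_{j \in S_{\bar{\imath}} \cap S_i} \left(\frac{1}{3} + z\right) \cdot y_j + \sum_{j \in S_{\bar{\imath}} \cap O_i} \left(\frac{2}{3} + 2z\right) \cdot y_j \right)$$
$$\qquad + \sum_{j \in O_{\bar{\imath}} \cap B_i} y_j + \sum_{j \in O_{\bar{\imath}} \cap S_i} \left(\frac{1}{3} + z\right) \cdot y_j + \sum_{j \in O_{\bar{\imath}} \cap O_i} \left(\frac{2}{3} + 2z\right) \cdot y_j.$$

Let $d \in F_k(z, \bar{q}_1, q)$ be a vector such that $\mathrm{obj}^{z,\bar{q}_1}(d) = s_k(z, \bar{q}_1, q)$. We argue that:

$$\mathrm{obj}^{z,\bar{q}_1}(d) \le \mathrm{obj}^{z,\bar{q}_2}(d).$$

Note that, in the objective function, the term $\bar{q}$ only appears in $\phi(z, \bar{q})$. Hence
it is sufficient to argue that $\phi(z, \bar{q}_1) \le \phi(z, \bar{q}_2)$. Again this can be verified by
inspection: since z < g, we have that the term $+\bar{q}$ only appears in the numerator
of $\phi(z, \bar{q})$ and that the term $-\bar{q}$ only appears in the denominator of $\phi(z, \bar{q})$. Hence,
we must have $\phi(z, \bar{q}_1) \le \phi(z, \bar{q}_2)$, which implies $\mathrm{obj}^{z,\bar{q}_1}(d) \le \mathrm{obj}^{z,\bar{q}_2}(d)$.

Finally, we combine this with the fact that $d$ is feasible in both LPs to
conclude:

$$s_k(z, \bar{q}_1, q) = \mathrm{obj}^{z,\bar{q}_1}(d) \le \mathrm{obj}^{z,\bar{q}_2}(d) \le s_k(z, \bar{q}_2, q).$$

□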



Since we know that $\bar{q}$ takes values in the range $0 \le \bar{q} \le 3$, Proposition 23
implies that, if we show

$$\max(s_1(z, 3, q), s_2(z, 3, q)) \le c_z + d_z \cdot q,$$

then we have shown that

$$\max(s_1(z, \bar{q}, q), s_2(z, \bar{q}, q)) \le c_z + d_z \cdot q,$$

for all possible values of $\bar{q}$.

Next, we show that each of the individual LPs can be bounded by a linear
function. For each $k \in \{1, 2\}$, we show that $s_k(z, 3, q) \le c_{z,k} + d_{z,k} \cdot q$. Note that,
if we write $\mathcal{P}_k$ in standard form $\max_x \{c^T x : Ax = b, x \ge 0\}$, then $q$ only appears
in the right-hand side $b$. Using this, along with the fact that $\mathcal{P}_k$ is a maximization
problem, allows us to apply standard results to argue that $s_k(z, 3, q)$ is a concave
piecewise-linear function with respect to $q$ (see Appendix B.6 of [1] or [5]). Since
the function is concave, we can obtain our upper bound by setting $c_{z,k} + d_{z,k} \cdot q$
to be the first piece of this function.

Therefore, we set $c_{z,k}$ and $d_{z,k}$ to be the piece of $s_k(z, 3, q)$ for $q = 0$. If two
pieces meet at $q = 0$, then we select the right-hand piece. This can be done as
follows:

- We set $c_{z,k} = s_k(z, 3, 0)$.

- To find $d_{z,k}$, we use standard sensitivity analysis techniques. Write $\mathcal{P}_k(z, 3, 0)$
in the standard form:

$$\max_x \{c^T x : Ax = b, x \ge 0\}.$$

The matrix $A$ has eight rows: the first seven rows correspond to Constraints (10)
through (16), and the final row corresponds to the matching pennies
constraint added in Definition 21. Let $\Delta b$ be the following perturbation column
vector, which like $b$ has dimension eight:

$$(\Delta b)_i = \begin{cases} \frac{-z}{\frac{2}{3} - z} & \text{if } i \text{ is 2 or 4,} \\ \frac{2z}{\frac{1}{3} - 2z} & \text{if } i \text{ is 6,} \\ 0 & \text{otherwise.} \end{cases} \qquad (18)$$

Note that $q$ only appears in Constraints (11), (13), and (15), which
correspond to the second, fourth, and sixth rows of $A$. It can be seen that $\Delta b$
contains the coefficients of $q$, for the constraints in which it appears, and
0 for the constraints that do not contain $q$. Let $\mathcal{D}_k$ be the dual LP of $\mathcal{P}_k(z, 3, 0)$
and let $\mathcal{D}^*_k$ be the optimal set of $\mathcal{D}_k$. We can then obtain $d_{z,k}$ by solving the
following LP:

$$d_{z,k} = \min_y \{\Delta b^T y : y \in \mathcal{D}^*_k\}.$$

In Section 3 and Section 4.2 of [5] it is shown that $d_{z,k}$ is the right-derivative
of $s_k(z, 3, q)$ at $q = 0$. Therefore, this approach is correct.
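
The role of the right-hand piece can be illustrated without solving any LP. The sketch below models a concave piecewise-linear function as the pointwise minimum of finitely many lines, which is the shape LP duality gives to the optimal value of a maximization LP with a parametrised right-hand side (each line playing the role of one dual vertex). The list of `(intercept, slope)` pairs is hypothetical data, not values from the paper.

```python
from fractions import Fraction

def value(pieces, q):
    """Concave piecewise-linear function: pointwise minimum of the lines."""
    return min(a + b * q for a, b in pieces)

def right_hand_piece(pieces):
    """Intercept and slope of the piece that is active just to the right of q = 0."""
    c = min(a for a, _ in pieces)             # function value at q = 0
    d = min(b for a, b in pieces if a == c)   # smallest slope among the lines active at 0
    return c, d

# Hypothetical lines; two of them meet at q = 0.
pieces = [(Fraction(1, 2), Fraction(1)),
          (Fraction(1, 2), Fraction(1, 4)),
          (Fraction(3, 4), Fraction(0))]
c, d = right_hand_piece(pieces)
print(c, d, [value(pieces, q) <= c + d * q for q in range(5)])
```

Choosing the smallest slope among the lines active at $q = 0$ mirrors the minimisation of $\Delta b^T y$ over the optimal dual solutions, and the selected line stays above the function for every $q \ge 0$ because the function is a minimum over all lines.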

Since we defined $c_{z,k} + d_{z,k} \cdot q$ to be the right-hand piece of $s_k(z, 3, q)$ at $q = 0$, and
since $s_k(z, 3, q)$ is concave, we have the following proposition.

Proposition 24. We have $s_k(z, 3, q) \le c_{z,k} + d_{z,k} \cdot q$.

So far, we have treated the two LPs separately. To conclude this section,
we combine the two bounds. To do this, we simply take the maximum over the
bounds that we have shown so far. More precisely, we set:

$$c_z = \max(c_{z,1}, c_{z,2}),$$
$$d_z = \max(d_{z,1}, d_{z,2}).$$

It is then clear that:

$$\max(s_1(z, 3, q), s_2(z, 3, q)) \le c_z + d_z \cdot q.$$

This then gives the main result of this section.

Proposition 25. If there is no matching pennies sub-game, then for every q-bad
row $i$ we have $R_i \cdot y^{\mathrm{imp}} \le c_z + d_z \cdot q$.

13 The test for P(z)

Recall that $y(t)$ is the convex combination $(1 - t) \cdot y + t \cdot y^{\mathrm{imp}}$ as defined in (8). In
Section 5, Definition 6, we defined the property $P(z)$ which holds if there exists
a $t$ such that $R_i \cdot y(t) \le \frac{2}{3} - z$, for all rows $i$. In this section, we develop a test
that proves that $P(z)$ holds for a restricted range of $z$. This test will use the fact
that, if there is no matching pennies sub-game, then for every q-bad row $i$, we
can bound $R_i \cdot y^{\mathrm{imp}}$ by the linear function $c_z + d_z \cdot q$. Therefore, for the rest of
the section we fix $c_z$ and $d_z$ to be the constants described in Section 12.

The procedure starts by finding $t^*_z$. This is defined to be the smallest value
of $t$ for which, if $i$ is a 0-bad row, then $R_i \cdot y(t) \le \frac{2}{3} - z$. By definition we have
that $R_i \cdot y = \frac{2}{3} + 2z$, and we also know that $R_i \cdot y^{\mathrm{imp}} \le c_z + d_z \cdot 0$. Therefore $t^*_z$
is the solution of:

$$\left(\frac{2}{3} + 2z\right) \cdot (1 - t^*_z) + c_z \cdot t^*_z = \frac{2}{3} - z.$$



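This can be seen graphically in the following figure:

[Figure: $R_i \cdot y(t)$ plotted against $t$, with the value $\frac{2}{3} + 2z$ marked on the vertical axis.]

The line in the figure starts at $\frac{2}{3} + 2z$ when $t = 0$, and ends at $c_z$ when $t = 1$.
The point $t^*_z$ is the value of $t$ at which this line crosses $\frac{2}{3} - z$. We can solve the
equation to obtain the following formula:

$$t^*_z = \frac{3z}{\frac{2}{3} + 2z - c_z}. \qquad (19)$$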






For each q-bad row $i$, we have a trivial bound of

$$R_i \cdot y^{\mathrm{imp}} \le 1. \qquad (20)$$

Note that if $q$ is large, then this bound will be better than our bound of $c_z + d_z \cdot q$.
The next step of our procedure is to find $q^*_z$, which is the smallest value of $q$
such that, using this trivial bound (20), we can conclude that $R_i \cdot y(t^*_z) \le \frac{2}{3} - z$.
Formally, we define $q^*_z$ to be the solution of:

$$\left(\frac{2}{3} + 2z - q^*_z z\right) \cdot (1 - t^*_z) + t^*_z = \frac{2}{3} - z.$$
This can be seen diagrammatically in the following figure:

The figure shows that we fix a line with gradient 1 that passes through $\frac{2}{3} - z$
when $t = t^*_z$. Then, $q^*_z$ is defined to be the point at which this line meets the
y-axis of the graph, where $t = 0$. Solving the equation gives the following formula
for $q^*_z$.

$$q^*_z = \frac{(2z - \frac{1}{3}) \cdot t^*_z - 3z}{z t^*_z - z}. \qquad (21)$$
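
Formulas (19) and (21) can be checked against the equations that define them. In the sketch below the value of `c_z` is a made-up placeholder, since the true constant comes from solving the LPs of Section 12; only the algebra is being illustrated.

```python
from fractions import Fraction

TWO_THIRDS = Fraction(2, 3)

def t_star(z, c_z):
    """Formula (19): where the line from 2/3 + 2z down to c_z crosses 2/3 - z."""
    return 3 * z / (TWO_THIRDS + 2 * z - c_z)

def q_star(z, t):
    """Formula (21), written with the common denominator z*t - z."""
    return ((2 * z - Fraction(1, 3)) * t - 3 * z) / (z * t - z)

z = Fraction(1, 100)      # illustrative value of z
c_z = Fraction(1, 2)      # hypothetical placeholder, not a value from the paper
t = t_star(z, c_z)
q = q_star(z, t)

# Both defining equations hold exactly.
print((TWO_THIRDS + 2 * z) * (1 - t) + c_z * t == TWO_THIRDS - z)
print((TWO_THIRDS + 2 * z - q * z) * (1 - t) + t == TWO_THIRDS - z)
```

Exact rational arithmetic is used so that the two equalities can be tested with `==` rather than with a floating-point tolerance.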



For rows $i$ that are q-bad for $q \ge q^*_z$, we can apply the trivial bound (20) to
argue that $R_i \cdot y(t^*_z) \le \frac{2}{3} - z$. Therefore, we need only be concerned with rows
$i$ that are q-bad with $0 \le q < q^*_z$. The next proposition gives a test that can be
used to check whether all such rows will have the property $R_i \cdot y(t^*_z) \le \frac{2}{3} - z$.

Proposition 26. Suppose that there is no matching pennies sub-game. If

$$c_z + d_z \cdot q^*_z \le 1,$$

then $R_i \cdot y(t^*_z) \le \frac{2}{3} - z$ for all rows $i$.






Proof. Suppose that $i$ is a q-bad row. We begin with the case where $q \ge q^*_z$. In
this case, we have:

$$R_i \cdot y(t^*_z) = \left(\frac{2}{3} + 2z - qz\right)(1 - t^*_z) + R_i \cdot y^{\mathrm{imp}} \cdot t^*_z.$$

Since we have $R_i \cdot y^{\mathrm{imp}} \le 1$, and we have $q \ge q^*_z$, we can obtain:

$$R_i \cdot y(t^*_z) \le \left(\frac{2}{3} + 2z - qz\right)(1 - t^*_z) + t^*_z \le \left(\frac{2}{3} + 2z - q^*_z z\right)(1 - t^*_z) + t^*_z = \frac{2}{3} - z.$$

We now consider the case where $q < q^*_z$. Once again we begin with:

$$R_i \cdot y(t^*_z) = \left(\frac{2}{3} + 2z - qz\right)(1 - t^*_z) + R_i \cdot y^{\mathrm{imp}} \cdot t^*_z.$$

Proposition 25 implies that $R_i \cdot y^{\mathrm{imp}} \le c_z + d_z q$, and by assumption we have
that $c_z + d_z q \le 1$. Hence, we have

$$R_i \cdot y(t^*_z) \le \left(\frac{2}{3} + 2z - qz\right)(1 - t^*_z) + (c_z + d_z q) \cdot t^*_z.$$

Note that this expression is linear in $q$. When $q = 0$ we have, by the definition
of $t^*_z$:

$$\left(\frac{2}{3} + 2z - qz\right)(1 - t^*_z) + (c_z + d_z q) \cdot t^*_z = \left(\frac{2}{3} + 2z\right)(1 - t^*_z) + c_z \cdot t^*_z = \frac{2}{3} - z.$$

On the other hand, when $q = q^*_z$, we can use the assumption that $c_z + d_z q^*_z \le 1$,
and the definition of $q^*_z$ to obtain:

$$\left(\frac{2}{3} + 2z - qz\right)(1 - t^*_z) + (c_z + d_z q) \cdot t^*_z = \left(\frac{2}{3} + 2z - q^*_z z\right)(1 - t^*_z) + (c_z + d_z q^*_z) t^*_z$$
$$\le \left(\frac{2}{3} + 2z - q^*_z z\right)(1 - t^*_z) + t^*_z = \frac{2}{3} - z.$$

Hence, we have shown the following inequality for the points $q = 0$, and $q = q^*_z$.

$$\left(\frac{2}{3} + 2z - qz\right)(1 - t^*_z) + (c_z + d_z q) \cdot t^*_z \le \frac{2}{3} - z.$$

Since the expression is linear in $q$, it follows that we have the same inequality
for all $q$ in the range $0 \le q \le q^*_z$. This allows us to conclude, for the case where
$0 \le q < q^*_z$, that:

$$R_i \cdot y(t^*_z) \le \frac{2}{3} - z.$$

□
14 Proof of Theorem 7

Obviously, we can also apply all of our reasoning to the strategy $x$ of the row
player, simply by swapping roles of the two players. Hence, the strategy $x(t)$, for
$t$ in the range $0 \le t \le 1$ is well defined, and Proposition 26 holds for $x(t^*_z)$.

To test whether we can construct a $(\frac{2}{3} - z)$-WSNE for a given constant z < g,
we do the following. First we find $c_z$ and $d_z$. Then we compute $t^*_z$ and $q^*_z$ using
Formulas (19) and (21). Finally, we test whether $c_z + d_z \cdot q^*_z \le 1$. If this inequality
holds, then we have that $(x(t^*_z), y(t^*_z))$ is a $(\frac{2}{3} - z)$-WSNE.
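
The test can be sketched end to end as follows. The constants `c_z` and `d_z` below are hypothetical placeholders (the real ones require solving the LPs of Section 12), and `payoff_bound` is an illustrative helper that combines the linear bound of Proposition 25 with the trivial bound (20).

```python
from fractions import Fraction

def star_values(z, c_z):
    """t*_z from Formula (19) and q*_z from Formula (21)."""
    t = 3 * z / (Fraction(2, 3) + 2 * z - c_z)
    q = ((2 * z - Fraction(1, 3)) * t - 3 * z) / (z * t - z)
    return t, q

def passes_test(z, c_z, d_z):
    """The final check: c_z + d_z * q*_z <= 1."""
    _, q = star_values(z, c_z)
    return c_z + d_z * q <= 1

def payoff_bound(z, c_z, d_z, q):
    """Upper bound on R_i . y(t*_z) for a q-bad row, using the better of the
    linear bound c_z + d_z * q and the trivial bound 1 on R_i . y^imp."""
    t, _ = star_values(z, c_z)
    imp = min(Fraction(1), c_z + d_z * q)
    return (Fraction(2, 3) + 2 * z - q * z) * (1 - t) + imp * t

z = Fraction(1, 100)
c_z, d_z = Fraction(1, 2), Fraction(1, 20)   # hypothetical constants
print(passes_test(z, c_z, d_z))
print(all(payoff_bound(z, c_z, d_z, Fraction(n, 2)) <= Fraction(2, 3) - z
          for n in range(25)))
```

When the test passes for these placeholder constants, the bound stays at or below $\frac{2}{3} - z$ on the whole grid of $q$ values, which is the conclusion of Proposition 26.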

Proposition 27. If there is no matching pennies sub-game, then $(x(t^*_z), y(t^*_z))$ is a
$(\frac{2}{3} - z)$-WSNE, with $z = 0.004735$.

Proof. It is simple to verify, through computation, that when $z = 0.004735$ we
have $c_z + d_z q^*_z \le 1$. Hence, when $z = 0.004735$, we can apply Proposition 26 to
prove the following two statements.

- If there is no matching pennies sub-game, then $R_i \cdot y(t^*_z) \le \frac{2}{3} - z$, for all $i$.

- If there is no matching pennies sub-game, then $C^T_j \cdot x(t^*_z) \le \frac{2}{3} - z$, for all $j$.

Hence, we have shown that the maximum possible regret that can be suffered
by either of the two players in $(x(t^*_z), y(t^*_z))$ is $\frac{2}{3} - z$. Therefore, we have shown
that $(x(t^*_z), y(t^*_z))$ is a $(\frac{2}{3} - z)$-WSNE. □

The value of $z$ used in Proposition 27 is close to the best that we can achieve,
because when $z = 0.004736$ we have $c_z + d_z q^*_z > 1$. This completes the proof of
Theorem 7.